Abstract 



In this article, we review the empirical evidence on the impact of education vouchers on student 
achievement, and briefly discuss the evidence from other forms of school choice. The best 
research to date finds relatively small achievement gains for students offered education vouchers, 
most of which are not statistically different from zero. Further, what little evidence exists 
regarding the potential for public schools to respond to increased competitive pressure generated 
by vouchers suggests that one should remain wary that large improvements would result from a 
more comprehensive voucher system. The evidence from other forms of school choice is also 
consistent with this conclusion. Many questions remain unanswered, however, including 
whether vouchers have longer-run impacts on outcomes such as graduation rates, college 
enrollment, or even future wages, and whether vouchers might nevertheless provide a cost-neutral 
alternative to our current system of public education provision at the elementary and 
secondary school level. 

Of the nearly 60 million school-aged children in the U.S., 87% attend a public school, 
11% attend a private school, and approximately 2% are home-schooled. Fewer than 60,000 
currently participate in a publicly-funded school voucher program. Rather, the vast majority - 
nearly three-quarters of all students - attend their neighborhood public school (Tice, et al. 2006). 
And yet many complain that these traditional public schools do not educate children well, as 
evidenced by stagnant test scores and poor showings in international comparisons. What is not 
so clear is how to address these concerns. 

Convinced that schools need to manage their resources better, policymakers have recently 
attempted to inject more accountability into the education sector. One approach is “test-based” 
or “administratively-based” accountability, in which students are regularly assessed and 
the results of these assessments are made public. The theory is that with more information about 
the performance of the local public schools, parents and administrators will demand a better 
product. A second - more controversial - approach is to increase accountability by increasing 
the educational choices available to parents. If the current system does, indeed, provide 
education to children inefficiently, then by increasing choice (which should induce competition), 
one can, theoretically, improve student achievement without significantly increasing public 
expenditures. 

Choice in the education sector can take many forms. Since school quality factors heavily 
into family residential decisions (e.g., Barrow 2002 and Black 1999), “residential choice” is the 
most prevalent form of public school choice. However, several programs increase school choice 
for families after they have decided where to live. Open enrollment programs - such as those in 
Charlotte-Mecklenburg, NC and Milwaukee, WI - ask parents to rank public schools within the 
district and then assign children to schools according to parental preference. Magnet schools - 
which were largely developed in response to desegregation efforts - are typically specialized 
schools to which all district students are eligible to apply. More recently many students also 
have the option of applying to charter schools, which are publicly-funded but operate with 
greater autonomy than traditional public schools. Finally, since the early 1990s, several small- 
scale voucher programs have been started in the U.S. - some publicly financed and others 
privately financed. In this paper we focus on the evidence from education vouchers, one 
particular strategy for increasing competition in education provision and thereby accountability. 

Because vouchers are a market-level intervention, there are many important factors to consider when 
evaluating the potential impact of school vouchers on society, including their effect on student 
outcomes, school efficiency (including costs), and social stratification (both within and across 
schools and neighborhoods). Decoupling school finance from residential decisions would also 
likely impact housing markets and markets for education inputs, such as teachers. Unfortunately, 
a comprehensive treatment of all of these dimensions is beyond the scope of this review. 
Instead, we focus on the empirical evidence on the impact of education vouchers on student 
achievement, and briefly discuss evidence from other forms of school choice. Our discussion is 
limited to U.S. voucher programs since, theoretically, the (relative) effectiveness of such 
programs depends on the relative efficiency of the public sector schools as well as the existing 
competitive environment in education. For example, public elementary and secondary schooling 
in the U.S. has largely depended on local financing, meaning that choice between local school 
districts may already generate strong competitive pressure. As a result, there may be less 
potential for vouchers to generate large efficiency gains (see, e.g., Barrow & Rouse 2004). A 
less efficient public sector and a less competitive (public schooling) environment may explain 
the large impacts of school vouchers that have been estimated in other countries, such as 
Colombia (see, e.g., Angrist et al. 2002). 

After reviewing the empirical evidence from the U.S., we conclude that expectations 
about the ability of vouchers to substantially improve achievement for the students who use them 
should be tempered by the results of the studies to date. In addition, while not as extensive or 
compelling, the evidence of meaningful gains for those students who remain behind in the public 
schools is also weak. That said, many questions remain - for example, no studies have examined 
the longer-run impact of vouchers on outcomes such as graduation rates, college enrollment, and 
future wages. Further, the research designs for studying the potential impacts of vouchers on 
students who remain in the public schools are far from ideal. 

In the next section, we discuss the theoretical reasons why education vouchers should 
improve student achievement and then review the empirical approaches used for identifying such 
effects of vouchers. Next, we present the best evidence examining the impact of school vouchers 
on student achievement from existing studies of publicly- and privately-financed programs. We 
then briefly discuss evidence from other forms of school choice, consider other potential 
increases in social welfare, and finally conclude. 

WHY COMPETITION SHOULD IMPROVE THE EDUCATIONAL SYSTEM 

The idea of injecting competition into the public school system is not new; Milton 
Friedman (1962) argued for separating the financing and provision of public schooling by 
issuing vouchers redeemable for a maximum amount per child if spent on education. The basic 
rationale behind school vouchers is that competitive markets allocate resources more efficiently 
than do monopolistic ones. Many observers argue that since children are assigned to attend their 
local neighborhood school, public schools in the U.S. have “monopoly” power. Once a family 
has decided where to live, they have significantly fewer publicly-funded schooling options. 
Parents can choose to send their child to private school, but that means paying for schooling 
twice: once through property taxes (for the public schooling they are not using) and again 
through private school tuition. If parents had more publicly-funded options, then schools would 
have to compete for students. More options might also increase (allocative) efficiency by 
improving the match between students and their educational interests and needs. Importantly, 
schools in this model would have an incentive to improve along the margins valued by parents. 
If parents select schools based on their academic quality, then schools would compete for 
students along such margins; if parents value religious education or sports, then one would 
expect to see schools respond along these margins.^1 

One of the challenges in considering the impact of education vouchers on student 
achievement in the abstract is that impacts likely hinge on the design of the voucher program in 
question. For example, two design features are the generosity of the voucher and whether 
parents can “top up” the voucher to attend a school that charges tuition exceeding the voucher 
amount. In Friedman’s original conception, the government would pay an amount per pupil for 
schooling and this voucher could be used to pay part, or all, of the tuition at any “approved” 
school. Further, all students would be eligible for a voucher and there would be no 
government-imposed regulations on how the private schools select their students. By allowing “topping-up,” 
extending eligibility to all students, and not imposing restrictions on the admissions processes 
used by the private schools, all schools become “public” schools in Friedman’s voucher system 
as all accept public financing. In this model the scope for competitive pressure on local public 
schools is quite large. 

^1 Even if parents value “school quality,” a related, yet poorly understood, issue is what is 
meant by school quality in practice. A school can have high levels of academic achievement not 
because the school is adding significant value to the students, but because the students it attracts 
are already high achieving. Thus, if parents select schools based on the average level of the 
academic outcomes of the students, then they are implicitly putting more weight on the peers 
their child would likely encounter while at that school rather than on the school’s ability to raise 
the achievement of a randomly selected child (irrespective of the average achievement of the 
child’s classmates). Rothstein (2006) finds evidence consistent with parents choosing schools 
based on the potential peer group offered by the school rather than a productive advantage. This 
finding suggests that school choice would not provide an incentive for public schools to improve 
along academic dimensions defined as value-added. In contrast, Hanushek et al. (2007) find 
evidence suggesting that parents put some weight on a school’s value added as well. 

Another important design element is whether transportation is provided with the voucher. 
Currently, local public schools provide transportation to students and because most students are 
assigned to attend their neighborhood school, transportation costs are minimized. Whether a 
voucher program would also pay for transportation affects the viable schooling alternatives for 
parents, which would affect the level of competition. Paying for transportation increases the 
choices available to parents, but also increases costs. 

Far from Friedman’s ideal, the publicly-funded voucher programs in the U.S. to date 
require that the participating schools accept the voucher as the full, or a substantial portion of, 
tuition (although they may be allowed to charge additional “fees”); are limited to low-income 
students or to students attending very low-performing schools; and require participating schools 
to accept all potential voucher students who apply or to randomly choose among them if 
oversubscribed. Most of the programs also provide some transportation for the participating 
students, particularly if the private school is not located far from the student’s home. With the 
exception of the provision of transportation, the other features of these programs - their 
relatively small size and the restrictions on the private schools - likely dampen the potential for a 
substantial increase in the choices available to parents. 

^2 Friedman is silent on whether the voucher should be “flat” - where all students would 
receive the same amount - or “graduated” depending on family income or the child’s educational 
needs (e.g., special education or bilingual education). 

^3 The primary government role in this conception is to impose “minimum” standards. 

Keeping these issues in mind and assuming that parents value academic quality in their 
child’s school, there are two hypothesized ways by which increased school choice would 
improve student educational outcomes. The first is a “direct” effect for those students who 
actually exercise choice. Assuming that students would only choose to attend a school other than 
their neighborhood school if the alternative were better (or a better match), then the academic 
achievement of students who opt for a different school should improve relative to what their 
performance would have been had they stayed in the public school. The second is a “systemic” 
or “general equilibrium” effect on students remaining in the public schools. Increased 
competition should induce the public schools to improve in an effort to attract (or retain) 
students. Not only should the achievement of those who choose to attend a private (or 
alternative) school increase, but so should the achievement of those who do not choose to leave. 
In other words, the increase in competition should also increase the efficiency of public 
schools. Of course, expansion of the private sector is a critical component of increasing 
competition. Without new school entry and/or increases in the size of current private schools, 
vouchers would have limited ability to increase choice. 

The fact that many empirical studies find that students in private schools have higher 
educational achievement levels than those in public schools (see, e.g., Coleman et al. 1982a, 
1982b; Evans & Schwab 1995; Neal 1997; and Altonji et al. 2005a) is presented by voucher 
advocates as prima facie evidence that vouchers would improve student achievement for all. 
Namely, they argue that private schools outperform public schools because their existence 
depends on providing a good product. Educational vouchers are intended to make public schools 
compete in this same way; thus, only schools (either public or private) providing a good product 
would survive. However, this literature is not conclusive because of the difficulty (described 
below) in identifying a causal impact of private schools on student achievement. Not 
surprisingly, critics argue that the observed superiority of private schools in these studies arises 
from omitted variables bias - the students who attend private schools differ from the students 
who attend public schools - rather than differences in the effectiveness of the schools (see, e.g., 
Goldberger & Cain 1982, Cain & Goldberger 1983, and Altonji et al. 2005b). If this is the case, 
the achievement of current public school students would not necessarily improve in private 
schools. 

While the debate continues whether private schools, in general, are better at educating 
children than public schools, researchers have turned to more direct evidence on the impact of 
vouchers by studying actual school voucher programs. 

EMPIRICAL APPROACHES TO STUDYING SCHOOL VOUCHERS 



Economists typically model school outcomes using an “education production function” 
where schools produce education using inputs and a production technology. The effect of 
particular inputs on the output (e.g. educational achievement) can then be measured, usually for 
each student. As the theory behind school vouchers is silent on the source of the increased 
efficiency - whether it arises from differential use (production technology) and/or the level of 
inputs - the empirical research has simply attempted to study whether educational outcomes in 
the presence of vouchers are better than educational outcomes in the absence of vouchers. 

Strategies to Estimate the Direct Impact of Vouchers 

To study the direct impact of vouchers on the students who use them, analysts have 
estimated versions of the following education production function: 

$E_{it} = \alpha + \beta_t V_{it} + X_i \gamma + \varepsilon_{it}$    (1) 

where $E_{it}$ represents the output for student i in year t; $V_{it}$ represents whether student i used a 
voucher in year t; $X_i$ represents observable student characteristics (such as sex and race); and $\varepsilon_{it}$ 
is an error term that represents all the other factors affecting achievement but not observed by the 
researcher. (One could also constrain the impact of the voucher program to be linear, in which 
case the independent variable would be the number of years since the student was eligible for, or 
actually enrolled in, a voucher program.) Note here that $V_{it}$ proxies for the bundle of inputs and 
production technologies that make a school a “school” - including the peer group, which may 
not be under the control of a school (especially a public school). 

A positive coefficient on $V_{it}$ ($\beta_t > 0$) suggests that students who used a voucher had better 
educational outcomes than those who did not. Further, researchers typically infer that the impact 
of using a voucher (the coefficient on $V_{it}$) derives from differences in the effectiveness of the 
schools attended by voucher students compared to those who opted to remain in a public school. 
The problem in estimating equation (1) (and its variants) is that the non-voucher students may 
not provide a valid counterfactual to the voucher students. In particular, voucher use and 
educational outcomes may be endogenous such that $E[\varepsilon_{it} \mid V_{it}, X_i] \neq 0$. For example, students with 
parents who are very educationally focused and motivated may be more likely to apply for a 
school voucher, yet these students may have done better than their non-voucher classmates even 
in the absence of the voucher program. Unless these non-school inputs are fully observable to the 
researcher such that she is able to control for them, the estimated impact of vouchers ($\beta_t$) will 
likely be biased. As a result, to estimate the effect of vouchers on school outputs, researchers 
have relied on analytical strategies that adequately control for differences between the two 
groups of students. 
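
The selection problem described above can be illustrated with a small simulation. The sketch below is not taken from any study reviewed here: the data-generating process, the `motivation` variable, and every parameter value are assumptions chosen purely for illustration. The true voucher effect is set to zero, unobserved family motivation raises both voucher use and test scores, and the function returns the naive gap between users and non-users, which is what OLS on equation (1) without controls would report.

```python
import random
from statistics import fmean

def simulate_selection_bias(n=20000, true_effect=0.0, seed=0):
    """Naive voucher-user vs. non-user gap when motivation is unobserved."""
    rng = random.Random(seed)
    users, non_users = [], []
    for _ in range(n):
        motivation = rng.gauss(0.0, 1.0)  # unobserved family input (assumed)
        # Assumed selection rule: motivated families are more likely to use a voucher.
        uses_voucher = motivation + rng.gauss(0.0, 1.0) > 0.5
        score = true_effect * uses_voucher + 0.5 * motivation + rng.gauss(0.0, 1.0)
        (users if uses_voucher else non_users).append(score)
    # With a single binary regressor, the OLS slope equals the difference in means.
    return fmean(users) - fmean(non_users)

naive_gap = simulate_selection_bias()
print(f"naive estimate with a true effect of zero: {naive_gap:.2f}")
```

Under these assumed parameters the naive gap is clearly positive even though vouchers have no effect by construction, which is the bias the text warns about.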

One of the most common strategies is to assume that a student’s prior test score(s), in 
year t-1 (or earlier), reflects her innate “ability” or motivation, as well as the accumulation of the 
schooling inputs she has received up to year t. Under this assumption, researchers typically 
estimate current achievement as: 

$E_{it} = \alpha' + \lambda_t E_{it-1} + \beta_t' V_{it} + X_i \gamma' + \varepsilon_{it}'$    (2) 

where $E_{it-1}$ reflects prior achievement (typically a test score). The identifying assumption is that 
$E_{it-1}$ fully proxies for the inputs that affect a student’s achievement prior to using a voucher and 
are also correlated with a student’s likelihood of using a voucher such that $E[\varepsilon_{it}' \mid V_{it}, X_i, E_{it-1}] = 0$. 
Clearly, this is a strong assumption as test scores are, at best, a noisy measure of student ability. 
Additionally, one must assume that this prior test score (or a series of test scores) also captures 
the unmeasured characteristics that led some students to apply for a voucher program (or that 
determine eligibility) while others did not. 

If $E_{it-1}$ was obtained before the student was selected for, or enrolled in, the voucher 
program, this strategy amounts to comparing the average change - or gain - in student 
achievement from before to after participation in the voucher program to the change in the test 
scores of students in the comparison group (the non-voucher students). In the case where the 
researcher observes outcomes for multiple periods, controlling for $E_{it-1}$ for periods t > 1 will only 
allow one to detect $\beta_t' > 0$ in the case where the yearly achievement gains of students in the 
voucher program are consistently higher than those of students in the comparison group. If there 
is an initial gain by voucher students followed by a plateau, the estimate of $\beta_t'$ obtained in 
equation (2) will potentially understate the impact of the voucher program since the early impact 
will have been effectively absorbed by $E_{it-1}$. 
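
A short numerical illustration of the plateau problem may help. The test scores below are hypothetical, and the sketch fixes the coefficient on the lagged score at one (a pure gain-score comparison), which is a simplification of equation (2) rather than the specification any particular study used.

```python
def yearly_gain_gaps(voucher_scores, comparison_scores):
    """Difference in year-over-year gains between the two groups, by year."""
    gaps = []
    for t in range(1, len(voucher_scores)):
        voucher_gain = voucher_scores[t] - voucher_scores[t - 1]
        comparison_gain = comparison_scores[t] - comparison_scores[t - 1]
        gaps.append(voucher_gain - comparison_gain)
    return gaps

# Hypothetical average scores in years 0 through 3 (year 0 is pre-program).
voucher = [50, 58, 61, 64]      # one-time 5-point boost in year 1, then a plateau
comparison = [50, 53, 56, 59]   # steady 3-point annual growth

gaps = yearly_gain_gaps(voucher, comparison)
cumulative_gap = voucher[-1] - comparison[-1]
print(gaps, cumulative_gap)
```

In this made-up example the gain comparison shows an advantage only in the first year, while the voucher group remains ahead in levels in every later year. Conditioning on the previous year's score in years 2 and 3 would therefore miss the persistent difference.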

A variant of equation (2) is to control for all time-invariant student characteristics by 
including student fixed effects. In this case $\beta_t'$ is identified only from students switching 
voucher status. The assumption that must hold for the fixed-effects estimate to generate an 
unbiased effect of vouchers is that there are no unobserved time-varying differences between the 
two groups of students that would explain changes in the test scores, except for use of a voucher. 
While this approach is appealing, one might be concerned that there are remaining unobserved differences 
affecting both the likelihood of using a voucher and educational achievement. 

Perhaps the most compelling strategy to generate causal estimates of the effect of 
vouchers on student outcomes is through the use of a random assignment design. In this 
“experimental” research design, students are randomly assigned to either a “treatment group” 
that is offered a school voucher or a “control group” that is not. In this case, there are no 
differences in the observed or unobserved non-school inputs, on average, between the two 
groups because the offer of a voucher was not determined by family income or one’s motivation, 
but rather by the “flip of a coin.” Thus, the typical empirical specification is, 

$E_{i\ell t} = \alpha'' + \theta_t S_{it} + X_i \gamma'' + \varphi_\ell + \varepsilon_{i\ell t}''$    (3) 

where $S_{it}$ indicates whether a student was selected for (or offered) a voucher and $\varphi_\ell$ represents the 
lottery ($\ell$) in which the student actually participated. Thus, by construction $E[\varepsilon_{i\ell t}'' \mid S_{it}, \varphi_\ell] = 0$ for 
students taking part in the randomization. Note that if the randomization is properly 
implemented, then one need not condition on other student characteristics ($X_i$) in order for the 
estimate of $\theta_t$ to be unbiased, although researchers will occasionally do so to gain efficiency in 
the standard errors. 

Ordinary least squares (OLS) estimation of equation (3) should generate an unbiased 
estimate of the impact of offering vouchers on student outcomes in a particular year ($\theta_t$), a 
parameter known as the “intent-to-treat” effect. This impact reflects two parameters that are 
important for evaluating a voucher program: the rate at which students actually take up vouchers 
and the relative achievement of students in private schools. As such, the intent-to-treat 
parameter has two appealing properties: it is the only unambiguously unbiased estimate that one 
can obtain using typical statistical methods such as OLS regression, and it reflects the overall 
potential gains from offering the vouchers as a policy, because it combines take-up with the 
relative gains for those who actually use the voucher. 
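
The way the intent-to-treat effect combines take-up with the gain for users can be sketched with a simulated lottery. Everything in the block is an assumption made for illustration: the coin-flip offer, the rule that only offered students with above-average (unobserved) motivation use the voucher, and the constant private-school effect of 0.2.

```python
import random
from statistics import fmean

def intent_to_treat(n=40000, private_effect=0.2, seed=1):
    """Simulate a voucher lottery; return (ITT estimate, take-up rate)."""
    rng = random.Random(seed)
    offered_scores, control_scores, used_flags = [], [], []
    for _ in range(n):
        motivation = rng.gauss(0.0, 1.0)   # unobserved by the researcher
        offered = rng.random() < 0.5       # the "flip of a coin"
        # Assumed take-up rule: offered students with motivation above zero enroll.
        uses = offered and motivation > 0.0
        score = private_effect * uses + 0.5 * motivation + rng.gauss(0.0, 1.0)
        if offered:
            offered_scores.append(score)
            used_flags.append(1.0 if uses else 0.0)
        else:
            control_scores.append(score)
    # Difference in means by offer status: OLS on the offer dummy alone.
    itt = fmean(offered_scores) - fmean(control_scores)
    return itt, fmean(used_flags)

itt, take_up = intent_to_treat()
print(f"ITT = {itt:.3f}, take-up = {take_up:.2f}")
```

Because the offer is random, the comparison by offer status is unaffected by motivation, and with a constant effect the estimate should be close to the take-up rate times the assumed private-school effect.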
Many are also interested in the effect of “treatment-on-the-treated” - whether students 
who actually use a voucher experience academic gains as a result. Because actual use of a 
voucher is not randomly determined, analysts must resort to non-experimental methods to 
generate consistent estimates of the effect of treatment-on-the-treated. A common approach is to 
use an instrumental variables strategy in which whether a student was randomly offered a 
voucher is used as an instrumental variable for the student attending a private school. This type 
of analysis generates a consistent estimate of whether the schools attended by voucher students 
were more, less, or equally effective compared with the schools attended by the non-voucher students.^4 
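
With a single binary instrument and no covariates, the instrumental variables estimate reduces to the Wald ratio: the difference in mean outcomes by offer status divided by the difference in private-school attendance by offer status. The sketch below illustrates that ratio on simulated data. The take-up rule, the effect size, and all names are assumptions for illustration, not features of any program discussed here.

```python
import random
from statistics import fmean

def wald_estimate(n=40000, private_effect=0.2, seed=2):
    """IV (Wald) estimate using the random voucher offer as the instrument."""
    rng = random.Random(seed)
    scores = {True: [], False: []}      # outcomes, keyed by offer status
    attendance = {True: [], False: []}  # private attendance, keyed by offer status
    for _ in range(n):
        motivation = rng.gauss(0.0, 1.0)
        offered = rng.random() < 0.5
        # Assumed: only offered students with motivation above zero attend private school.
        attends_private = offered and motivation > 0.0
        y = private_effect * attends_private + 0.5 * motivation + rng.gauss(0.0, 1.0)
        scores[offered].append(y)
        attendance[offered].append(1.0 if attends_private else 0.0)
    reduced_form = fmean(scores[True]) - fmean(scores[False])         # intent-to-treat
    first_stage = fmean(attendance[True]) - fmean(attendance[False])  # effect of offer on attendance
    return reduced_form / first_stage

tot = wald_estimate()
print(f"Wald estimate of the private-school effect: {tot:.2f}")
```

In this setup nobody in the control group attends private school, so the ratio recovers the effect for the students induced to attend by the offer.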

Properly implemented, a randomized design is viewed as the “gold standard” for 
estimating a causal relationship between vouchers and student outcomes. In practice, however, 
non-random differences can emerge between the treatment and control groups. For example, 
often researchers conducting the study are not able to collect follow-up data on every study 
student, potentially introducing non-random selection into the analysis. In addition, to the extent 
there are heterogeneous treatment impacts, the estimated impact of vouchers on student 
outcomes from one or two small studies may not represent the effect for a different group of 
students. 

Strategies to Estimate Public School Responses to Competitive Pressure 

The empirical strategies discussed above are designed to generate estimates of the direct 
impact of vouchers on student achievement. However, the true prize of a voucher system - or of 
any program designed to significantly increase the competitive pressure experienced by public 
schools - is overall improvement in the performance of the U.S. education system. 
Unfortunately, developing a study that would generate unbiased estimates of any such systemic 
impacts is extremely difficult.^5 The problem is that, in theory, the public schools should improve 
in response to the increased competition and thus increase the achievement of the public school 
students as well. As a result, the public school students do not represent what would have 
happened to the voucher students in the absence of the voucher program, so a simple comparison 
of the outcomes of students who use a voucher (or who were offered a voucher) to the outcomes 
of students who remained in the public schools (either by choice or because of “bad luck” in a 
lottery) would likely underestimate the general equilibrium impact. 

^4 Technically speaking, an instrumental variables analysis would generate a consistent 
estimate of the impact of attending a private school for those students who were induced to 
attend the private school only because of the voucher, an estimator known as the “Local Average 
Treatment Effect” (LATE) (Angrist & Imbens 1994; Angrist et al. 1996). 

Instead, one must first identify the relevant “market” for schooling within which a school 
exists. The key is that the unit of observation for this study is not the individual student, but the 
market. Ideally one would randomly assign some markets to a treatment group - where the 
students would be eligible for school vouchers - and randomly assign the remaining markets to a 
control group - where there would be no vouchers. After a period of time, the researcher would 
then compare the average outcomes of students in the voucher markets to those of students in the 
control markets. A simple comparison of student outcomes would yield an unbiased estimate of 
the general equilibrium impact of vouchers since, on average, the markets would have been 
similar ex-ante. While such an experiment is possible in theory, in practice it would be 
extremely difficult to implement primarily because it would require the coordination and 
cooperation of so many different stakeholders. As a result, researchers have turned to other 
research designs to try to generate a causal estimate of the impact of a large-scale voucher 
program. 

^5 Due to the difficulty of obtaining evidence on the impacts of a large-scale voucher 
program, a theoretical literature appeals to computable general equilibrium models to understand 
broader implications of vouchers, such as the impact on student sorting and residential 
segregation. For example, Epple & Romano (1998) focus on the impact of vouchers on student 
stratification and Nechyba (1999, 2000, 2003) considers the impact of different voucher schemes 
on residential mobility and segregation. Ultimately, though, the potential for school vouchers to 
improve student achievement in these models hinges on the relative impact of private schools vs. 
public schools on student achievement and/or on the response of public schools to increased 
competition (see, e.g., Epple & Romano (1998, 2003) and Nechyba (2003)). 

One approach that researchers have used is to model student achievement in existing 
public schools as a function of the competitive pressure experienced by the student’s school, 
school district, or metropolitan area. If public school student achievement improves, the 
assumption is that it is due to a response by the existing public schools to the increased 
competitive pressure. As such, researchers have estimated versions of the following equation: 

$E_{idt} = \alpha + \delta_t H_{dt} + X_i \gamma + \varepsilon_{idt}$    (4) 

where d indexes the area (the school district, metropolitan area, or geographic area around a 
particular school), and $H_{dt}$ is a measure of the competitive pressure faced by the school - such as 
the metropolitan-level Herfindahl-Hirschman Index^6, the number of schools within a particular 
radius of an existing public school, or the school’s likely exposure to competitive pressure 
because of the eligibility rules of a voucher program.^7 As before, the challenge is to identify 
districts (or metropolitan areas) facing little competitive pressure that can serve as valid 
comparisons to those facing increased competitive pressure. One strategy that has been used to 
address this endogeneity is to employ an instrumental variable that is correlated with the 
endogenous variable (the level of competitive pressure), but not correlated with the error term in 
the achievement equation. 

^6 A Herfindahl-Hirschman Index based on the concentration of enrollment in a 
geographic area is meant to proxy for the market power of public schools in the area and 
therefore the degree of “choice” that parents may have. 
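
For readers unfamiliar with the Herfindahl-Hirschman Index, the following minimal sketch computes it from enrollment counts as the sum of squared enrollment shares. Treating each list entry as one school district within a metropolitan area is an assumption made for the example, and the enrollment figures are invented.

```python
def herfindahl_index(enrollments):
    """Sum of squared enrollment shares; 1.0 means a single provider."""
    total = sum(enrollments)
    if total <= 0:
        raise ValueError("total enrollment must be positive")
    return sum((e / total) ** 2 for e in enrollments)

# Hypothetical metropolitan areas, each a list of district enrollments.
concentrated = herfindahl_index([9000, 500, 500])           # one dominant district
competitive = herfindahl_index([2500, 2500, 2500, 2500])    # four equal districts
print(concentrated, competitive)
```

A value near one indicates that enrollment is concentrated in a single provider (little choice), while values near zero indicate many similarly sized providers.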

Another strategy is to exploit non-linearities in voucher eligibility in an approach known 
as “regression discontinuity.” In this case, voucher eligibility is represented by a simple rule, 

v,=n^k,<k^ 

Vi^ = 0 otherwise, (5) 

where kts is the characteristic (or an index measure of characteristics) on which eligibility is 
determined (in this example i indexes the individual and s the school) and k* is the cutoff for 
eligibility. To date, this strategy has been used when students attending schools identified as 
chronically “failing” according to Florida’s school accountability system were eligible for a 
voucher to attend participating private schools or a higher-rated public school. The school’s 
“accountability points” clearly determined voucher eligibility and likely had an independent 
effect on student achievement (both because the school was failing and because students 
attending failing schools were more likely to come from disadvantaged families). However, 
students in schools earning just below the accountability point cutoff were arguably quite similar 

’ Note that although the unit of observation is the “market” (e.g., school district or 
metropolitan area), analysts often employ data on individual students. In this case, they must be 
careful to adjust the estimated standard errors to account for clustering of students within the 
same “market.” 
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to students in schools earning just above the accountability point cutoff. Thus, researchers 
identify the causal effect of voucher eligibility (and voucher “threat”) on student achievement 
(for students in the vicinity of the eligibility cutoff) by comparing the average educational 
achievement of students in schools just below the accountability point cutoff to the average 
educational achievement of students in schools just above the accountability point cutoff In 
practice, researchers have estimated: 

= a' + <■(*„) + /?V,„ + Xj:+ C (6) 

where c(kis) is a polynomial in the characteristic on which eligibility is determined. To the extent 
that schools can manipulate their accountability points to affect their identification as “failing,” 
the assumption that students attending schools on either side of the accountability point cutoff 
are otherwise quite similar is less compelling. A more general concern about estimates derived 
from regression discontinuity designs is that while they may generate unbiased estimates of the 
impact of a policy for schools near the cut-off point, in the presence of heterogeneous treatment 
effects, these impacts may not generalize to other schools. 
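
The logic of the specification in equation (6) can be illustrated with simulated data. The following is a minimal sketch, not the estimation code of any study cited here: the function names (`simulate_schools`, `rd_estimate`), the quadratic form chosen for the polynomial, the omission of additional covariates, and all parameter values are assumptions made purely for illustration.

```python
import numpy as np


def simulate_schools(n=20000, cutoff=0.0, beta=0.15, seed=0):
    """Simulate hypothetical student data with accountability points k."""
    rng = np.random.default_rng(seed)
    k = rng.uniform(-1.0, 1.0, n)       # accountability points, centered on the cutoff
    v = (k < cutoff).astype(float)      # eligibility: 1 if below the cutoff, 0 otherwise
    # Achievement varies smoothly with k and jumps by beta at the cutoff.
    y = 0.5 * k + 0.2 * k**2 + beta * v + rng.normal(0.0, 0.3, n)
    return k, v, y


def rd_estimate(k, v, y, degree=2):
    """Regress y on a constant, a polynomial in k, and the eligibility dummy v.

    Returns the least-squares coefficient on v (the estimated jump at the cutoff).
    """
    k = np.asarray(k, dtype=float)
    v = np.asarray(v, dtype=float)
    columns = [np.ones_like(k)] + [k**p for p in range(1, degree + 1)] + [v]
    X = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    return coef[-1]


k, v, y = simulate_schools()
print(round(rd_estimate(k, v, y), 3))
```

In this simulation the polynomial is correctly specified by construction; with real data the estimate can be sensitive to the polynomial order and to how close to the cutoff the sample is restricted.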

EVIDENCE ON SCHOOL VOUCHERS AND OTHER FORMS OF CHOICE 
Do Students Who Use School Vouchers Benefit? 

In the U.S., two types of school voucher programs have been studied: those financed by
the government (publicly-funded school vouchers) and those provided by the private sector
(privately-funded school vouchers). From a public policy perspective, the evidence from
publicly-funded programs is more relevant as they incorporate some of the design features that
might be built into a larger school voucher program, such as limitations on which students are
eligible to receive a voucher and whether transportation is provided or reimbursed. That said,
some of the most compelling evidence from a methodological perspective comes from the
privately-funded vouchers, so we review that evidence here as well. Note that because this
literature essentially compares the performance of students in private schools to that of students
in public schools, it bears striking similarity to that on differential effectiveness of private and
public schools.

In Table 1 we present a summary of selected findings from publicly-funded voucher
programs with formal evaluations. All of the estimates are converted to “effect sizes” (i.e., the
impact divided by the standard deviation of the test distribution) normalized by the national
standard deviation so that the implied magnitudes of the effects are not affected by the standard
deviation of the subgroup within each study. As such, these impacts can be interpreted as
proportions of a national standard deviation. As a benchmark for judging the magnitude of the
impacts, Hill et al. (2007) review effect sizes from many studies of educational interventions.
While they caution that it is only valid to compare effect sizes using comparable populations,
contexts and interventions, and outcomes being measured, they report an average estimated
effect size of approximately 0.2σ for studies involving elementary school children.
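
The normalization described above amounts to a single division. As a small illustration (the function name `effect_size` and every number below are hypothetical, not values taken from Table 1), the sketch shows why dividing by the national rather than the subgroup standard deviation matters for comparability across studies:

```python
def effect_size(impact, national_sd):
    """Return the impact expressed as a fraction of the national standard deviation."""
    if national_sd <= 0:
        raise ValueError("national_sd must be positive")
    return impact / national_sd


# Hypothetical numbers: a 3-point test score impact, a national standard deviation
# of 20 points, and a (narrower) subgroup standard deviation of 15 points.
impact = 3.0
print(effect_size(impact, 20.0))  # normalized by the national standard deviation
print(effect_size(impact, 15.0))  # same impact, larger when scaled by the subgroup spread
```

Because study samples often have less dispersed scores than the national population, scaling by the subgroup standard deviation would make identical raw impacts look larger in some studies than in others.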

Launched in the early 1990s, the Milwaukee Parental Choice Program is one of the oldest
publicly-funded voucher programs in the U.S. Low-income students are eligible to receive a
voucher, worth approximately $6,501 in the 2007-2008 academic year, to attend any participating
school (including religious schools). Nearly 19,000 students and 120 schools participated that
year.

Early studies evaluating potential achievement impacts of the program were conducted
when the program had only been in operation for about four years and vouchers could only be
used at non-religious schools. At that time, about 12 schools and 800 students participated.
Because the participating schools in the program were required to take all students who applied
or to randomly select among applicants in the event of over-subscription, researchers had two
potential comparison groups: unsuccessful applicants and a random sample of low-income
students from the Milwaukee Public Schools. Using both comparison groups, Rouse (1998)
reports mixed results of the direct effect of the program. She estimates intent-to-treat effect sizes
in the yearly gain of being selected for the program ranging from 0.06 to 0.11σ in math and from
-0.03 to 0.03σ in reading, although the impacts in reading are never statistically different from
zero.^8 The estimated yearly gain for those who actually use a voucher in math is 0.14σ while
that in reading is only 0.01σ (and not statistically different from zero).

^8 The range reflects estimates from different model specifications. Other studies using
these early Milwaukee data include Witte et al. (1995), Witte (1997), and Greene et al. (1999).
Using only the sample of low-income students from the Milwaukee Public Schools as a
comparison group, Witte et al. (1995) and Witte (1997) estimate no impact of the program on
student achievement. Greene et al. (1999) only use the unsuccessful applicants as a comparison
group and estimate a positive impact in both math and reading. See Rouse (1998) for further
discussion of the differences between the studies.

Evidence from the Cleveland Scholarship and Tutoring Program (CSTP) suggests even
smaller impacts on student outcomes. This voucher program is open to all students living within
the boundaries of the Cleveland Metropolitan School District with preference given to students
in low-income families.^9 Students are permitted to use the vouchers at both non-sectarian and
sectarian schools. (The tutoring program provides tutors to interested students from
kindergarten through twelfth grade.) As vouchers are (theoretically) allocated using a lottery, the
CSTP data allow researchers to identify two groups of applicants: voucher recipients
and non-recipients. Additionally, test scores and some longitudinal Cleveland Municipal School
District data are available for the first grade classmates of voucher recipients who did not use
their voucher as well as the first grade classmates of program applicants who did not use their
voucher or were not awarded a voucher, generating a (non-random) public school sample for
comparison (Metcalf 2001).

^9 The voucher is progressive in that it pays 90 percent of tuition up to $3,450 for those
with family income below 200 percent of the poverty line and 75 percent of tuition up to a
maximum of $3,450 for those from families earning above 200 percent of the poverty line. The
original program paid tuition up to a maximum of $2,250 (Metcalf et al. 1998).

Table 1 shows estimates from the cohort of students who entered kindergarten in 1997.
The intent-to-treat estimates compare voucher winners to rejected applicants while the treatment-
on-the-treated estimates compare voucher users to rejected applicants. The specifications
include the student’s test score from the previous year such that the results reflect the one-year
change in test scores rather than the cumulative impact of the voucher program. After three years
(when the students were in 2nd grade), the test score gain for voucher recipients was significantly
lower in math and reading than for applicants who were not offered a voucher. The estimated
gains for voucher users were also negative and statistically significant. After five years (when
the students were in 4th grade), the gains for those offered a voucher were lower in math but
higher in reading than those for non-recipients, although neither impact is statistically different
from zero. Similarly, voucher users had lower gains than applicant/non-recipients in math but
higher gains in reading; again, neither impact is statistically different from zero.
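
The distinction between intent-to-treat and treatment-on-the-treated estimates can be illustrated with a simulated lottery. The sketch below is an assumption-laden illustration, not the method of any particular study in Table 1: it uses the standard Wald rescaling (the intent-to-treat difference divided by the difference in voucher use between those offered and not offered), and the function names and all parameter values are hypothetical.

```python
import numpy as np


def itt_and_tot(offered, used, outcome):
    """Return (intent-to-treat, treatment-on-the-treated) estimates.

    ITT: mean outcome of those offered a voucher minus mean outcome of those not offered.
    TOT: ITT divided by the difference in voucher use between the two groups (Wald estimator).
    """
    offered = np.asarray(offered, dtype=bool)
    used = np.asarray(used, dtype=float)
    outcome = np.asarray(outcome, dtype=float)
    itt = outcome[offered].mean() - outcome[~offered].mean()
    take_up = used[offered].mean() - used[~offered].mean()
    return itt, itt / take_up


def simulate_lottery(n=400000, effect=0.10, take_up_rate=0.6, seed=0):
    """Simulate a hypothetical lottery in which only some winners use the voucher."""
    rng = np.random.default_rng(seed)
    offered = rng.random(n) < 0.5              # random offer of a voucher
    would_use = rng.random(n) < take_up_rate   # students who use a voucher if offered
    used = offered & would_use                 # nobody uses a voucher without an offer
    outcome = rng.normal(0.0, 1.0, n) + effect * used
    return offered, used, outcome


offered, used, outcome = simulate_lottery()
itt, tot = itt_and_tot(offered, used, outcome)
print(round(itt, 3), round(tot, 3))
```

Because only a fraction of those offered a voucher use it, the intent-to-treat estimate is smaller in magnitude than the implied effect on users; a direct comparison of users to non-recipients would additionally require that users not differ systematically from other applicants.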

While the studies from both Milwaukee and Cleveland attempt to construct valid 
comparison groups to generate causal impacts of the programs on student outcomes, they rely on 
observational data and therefore may be subject to omitted variables bias. In the case of 
Milwaukee, the bias could either be positive (in that the students who participated in the program 
were more motivated) or negative (in that the random sample of low-income students in the 
public schools were too advantaged relative to the voucher participants). While Rouse (1998) 
attempts to determine the extent of any such bias (and concludes it is likely minimal), it remains 
an untestable assumption. Belfield (2007) is subject to the same general concern because he 
does not observe the actual lotteries in which students participated and because the unsuccessful 
applicants may be more advantaged than lottery winners since preference was given to low-
income families.^10

^10 The estimates would be biased if a student’s likelihood of winning a voucher varies
across lotteries and participation in a specific lottery is correlated with student characteristics that
also determine achievement. Further, it is not clear whether the non-recipient group also contains
students who were not entered into the lottery due to the preference given to students from low-
income families as suggested by Metcalf (2001). In personal email correspondence, evaluators
of the program indicated that they believe these more economically advantaged students were
always part of the lottery. Finally, we note that Belfield (2007) includes some measures in his
empirical specifications - such as class size and teacher’s years of experience - that are arguably
outcomes of the voucher program; however, his results are robust to excluding these measures.

This methodological concern can, in theory, be addressed with the relatively new D.C.
Opportunity Scholarship Program (DCOSP) in Washington, D.C., which is being evaluated using
a random assignment program design.^11 In the first two years of the program (spring 2004 and
2005), 2,308 eligible public school students participated in lotteries: 1,387 were awarded a
scholarship and the remaining 921 students became the control group. Wolf et al. (2007)
estimate that after one year, intent-to-treat effect sizes for the first two cohorts of students ranged
from -0.01 to 0.07σ in math and from -0.01 to 0.03σ in reading. After two years, Wolf et al.
(2008) report that the impacts ranged from -0.02 to 0.01σ in math and from 0.05 to 0.08σ in
reading. Not only do these ranges include negative impacts, but none are statistically different
from zero at the 5% level.

To date, the evidence from publicly-funded voucher programs suggests, at best, mixed
improvement among those students who were either selected for a voucher (the intent-to-treat) or
who used one (the treatment-on-the-treated). The largest estimates, from the Milwaukee Parental
Choice Program, suggest potential gains in the intent-to-treat of 0.11σ in math and gains of 0.14σ
for those who actually attend a private school; most of the other estimates are much smaller or
even negative. However, with the exception of the program in Washington, D.C., the studies
suffer from potentially unsatisfactory comparison groups. As such, we now turn to evidence
from the privately-funded programs.

^11 See Wolf et al. (2007, 2008) for more details. Students attending low-performing
public schools were given a better chance of winning the lottery. Although private school
students were eligible for the vouchers, they were excluded from the evaluation.

Although a recent U.S. General Accounting Office (2002) report found 78 privately-
funded voucher programs to review, only a handful have been subject to any evaluation. Three
privately-funded voucher programs - New York City; Dayton, OH; and Washington, D.C. - had
randomized study designs, making them the best-suited for rigorous evaluation. As in the
DCOSP, each program had greater numbers of applicants than vouchers available so applicants
could be randomly selected to receive a voucher offer. In New York City, the number of
applicants was so large that the “control” group comprises a sample of applicants not
selected to receive a voucher.

As shown in Table 2, both Mayer et al. (2002) and Krueger & Zhu (2004) report small, 
statistically insignificant impacts of offering vouchers when analyzing all students. Further, after 
three years the estimated impact of attending a private school is at most 0.05σ, although even this
estimate (for the New York City program) is not statistically different from zero. 

A widely publicized result from these programs is that there may have been differences
across subgroups of students. Indeed, Howell & Peterson (2002) and Mayer et al. (2002) report
statistically significant positive effects of private school attendance on test scores for African
American students alone (see Table 2). For New York City and Washington, D.C. combined,
after three years African American students who used a voucher are estimated to have
experienced a 0.23σ gain in achievement; those in New York City are estimated to have gained
0.26σ. (In contrast, Howell et al. (2002) estimate a negative impact for African American
students after three years in Washington, D.C., although the impact is not statistically different
from zero.)

However, the estimated positive impact on African American students is not robust. In 
reanalyzing the data from New York City, Krueger & Zhu (2004) report that the results by race 
are particularly sensitive to two analytical decisions. First, Krueger & Zhu (2004) include all 
students, whereas Mayer et al. (2002) include baseline test scores in all of their specifications,
leading them to exclude all students missing baseline test score information, most of whom are
first grade students who were not administered a baseline test. As noted earlier, because students 
were randomly chosen to receive or not receive a voucher, baseline characteristics such as test 
scores should have been identical for the two groups, on average. The primary reason for 
including baseline characteristics is to improve the precision of the estimates. However, Krueger 
& Zhu (2004) find very little difference in the precision of the estimated impact of vouchers 
using a larger sample excluding baseline test scores compared to using the smaller sample with 
baseline test scores. As a result, they argue that the gain in terms of statistical precision is not 
large enough to warrant the cost of not generating estimates that are representative of the original 
target population. 

The second substantive difference between the studies is how the researchers identify a 
student’s race. Mayer et al. (2002) identify a student as African American if the mother’s race is 
reported as African American (non-Hispanic) irrespective of the race or ethnicity of the father. 
Krueger & Zhu (2004) use alternative identifications such as whether either parent is African 
American (non-Hispanic) or including the group of students whose parents responded “Other” to 
the survey, but indicated they (the parents) were Black in the open-ended response. With the 
larger sample and the broadest identification of students as African American, they report that 
the estimated intent-to-treat impact falls to 0.05σ after three years and the estimated treatment-on-
the-treated impact falls to 0.03σ.

In sum, there is little evidence of overall improvement in test scores for students offered 
an education voucher from privately-funded voucher programs. Although there is some evidence 
that African American students benefit from being offered a voucher in the New York City
study, the evidence is not robust to sensible alternative ways of constructing the analysis sample.

Do Students Who Remain in Public Schools Benefit? 

The voucher studies discussed above are based on relatively small voucher programs
in which the increase in competitive pressure was unlikely to be large enough to elicit a public
sector response. The estimates, therefore, reflect the direct effect of vouchers for those offered or
using them. Researchers have attempted to glean whether public school students would potentially
benefit from a large-scale program using evidence from two existing publicly-funded voucher
programs.

When the experimental phase of the Milwaukee Parental Choice Program ended in 1995, 
the program was expanded to allow for a maximum of 15% of the public school enrollment. 
Further, the Wisconsin Supreme Court ruled in 1998 that the vouchers could be used in religious 
schools as well. These two events led to a dramatic increase in program participation by both 
students and schools. In fact, the program was so popular that participation was expanded to a 
maximum of 22,500 voucher students in 2006. Researchers have attempted to analyze these last 
two expansions to estimate the potential impact of a large-scale voucher program on student 
achievement in the public sector (see Hoxby (2003), Carnoy et al. (2007), and Chakrabarti 
(2008)). While some of the details differ, the basic strategy of all three studies is to attempt to
identify those schools within the Milwaukee Public School District that face more or less
competitive pressure due to the income level of their students (schools with a high
proportion of low-income students who are eligible for the voucher program presumably face
more competitive pressure than those with a low proportion of low-income students), as well as
to identify observably comparable districts elsewhere in Wisconsin. Disproportionate gains
among students attending schools within Milwaukee facing competitive pressure compared to
schools within Milwaukee facing little pressure and districts outside of Milwaukee facing no
voucher pressure would be evidence of a positive impact of competition on school efficiency (as
reflected in student test scores).
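
The comparison of “disproportionate gains” described above is, in stylized form, a difference-in-differences calculation. The following sketch is only an illustration of that arithmetic: the function name `diff_in_diff` and the test score values (in standard deviation units) are hypothetical, and the cited studies use considerably richer specifications.

```python
import numpy as np


def diff_in_diff(treated_pre, treated_post, comparison_pre, comparison_post):
    """Change in mean scores for treated schools minus the change for comparison schools."""
    treated_change = np.mean(treated_post) - np.mean(treated_pre)
    comparison_change = np.mean(comparison_post) - np.mean(comparison_pre)
    return treated_change - comparison_change


# Hypothetical mean scores before and after a voucher expansion.
high_pressure_pre = [0.00, 0.10]     # schools facing the most competitive pressure
high_pressure_post = [0.30, 0.40]
comparison_pre = [0.50, 0.60]        # schools facing little or no voucher pressure
comparison_post = [0.60, 0.70]

estimate = diff_in_diff(high_pressure_pre, high_pressure_post,
                        comparison_pre, comparison_post)
print(round(estimate, 2))
```

The estimate attributes the extra gain to competition only if the two groups of schools would otherwise have followed the same trend, which is the identifying assumption at issue in these studies.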

As summarized in table 3, all three studies find evidence that with the expansion of the
voucher program in 1998, student performance improved in the first few years, especially in
schools that were most likely to be affected by the increased competition. For example, Hoxby
(2003) estimates that the 4th grade test scores of students attending schools likely facing the most
competitive pressure improved by 0.12σ per year in math and by 0.11σ per year in language
relative to students attending comparison schools outside of Milwaukee.

While interesting, these results must be interpreted as suggestive. First, the identifying 
assumption is that there are no unobserved changes from before to after the voucher program 
between the “treated schools” and the “comparison” schools. While certainly possible, it 
remains a strong identifying assumption, especially since within the Milwaukee Public School 
District all schools were potentially “treated” and outside of Milwaukee the demographic 
composition of the schools is quite different (specifically the students come from wealthier 
families and are less likely to be minorities). Second, Carnoy et al. (2007) present additional 
results that are not consistent with a simple interpretation that performance in the Milwaukee 
Public Schools improved due to increased competition. For example, as evidenced by a
comparison of rows (2) and (3) in table 3, they find there was no additional improvement after
2002 despite the fact that interest in the voucher program increased (as proxied by the number of
applications). Further, they find no evidence of a general equilibrium impact when they employ
other direct measures of competition; there are no positive achievement gains for students as the
number of nearby voucher schools increases or as the number of applications from a school
increases (rows (4) and (5) of table 3).^12

^12 Their results are quite similar when they limit the analysis to predominantly African
American schools.

In order for a voucher program to spur improvement within the public schools, there need
not be a substantial number (or proportion) of students who use a voucher to attend a private
school. Rather, if public school administrators perceive there is a threat that the students will do
so, they may have an incentive to respond by improving school quality. Thus, an alternative way
to gain insight into the potential response of public schools to increased competitive pressure is
to study the schooling outcomes of students attending schools that were under the threat of
becoming voucher-eligible. Researchers have done this by taking advantage of the design of
Florida’s school accountability system, Florida’s A+ Plan for Education. Specifically, since
1999 schools in Florida have been given a grade of A-F largely dependent on the performance of
their students. Schools receiving high grades or improving scores receive bonuses, while low
performing schools (graded either “D” or “F”) are subject to increased administrative oversight
and are provided with additional financial assistance. Further, if a school received an “F” in two
out of four years and has an “F” in the current year, students become eligible for vouchers called
Opportunity Scholarships. While other features of the A+ Plan remain in effect, the voucher
program was declared unconstitutional by the Florida Supreme Court in January 2006.
Thereafter students could no longer use a voucher to attend a participating private school, but
could still use a voucher to attend a higher-graded public school.

^13 Currently Florida has two other voucher programs as well: an income tax credit for
corporations to fund vouchers for low-income students and the McKay Scholarship for students
with disabilities. Greene & Winters (2008) study the impact of the McKay Scholarships on the
achievement gains of students with disabilities who remain in the public schools. Because their
estimation strategy identifies the general effect of vouchers using students whose disability status
changes, it is unclear to what extent these results generalize to overall improvements in the
public schools.

Under the Florida A+ Plan, school grades are determined by assigning “grade points”
based on student test score performance.^14 Grades are then assigned based on whether the school
is above or below the pre-determined cut points for each of the letter grades. Arguably, schools
receiving just enough grade points to earn a grade of “D” are no different from schools earning
just below the number of grade points needed to earn a grade of “D.” As a result, the schools
that received an “F” grade are quite similar to those that received a “D” grade along many
dimensions. Figlio & Rouse (2006), West & Peterson (2006), Chiang (2008), and Rouse et al.
(2007) therefore compare student outcomes from schools earning “D” and “F” grades while
controlling for the number of grade points earned in an effort to recover the causal effect of the
policy on educational achievement.

^14 Literally speaking, school grades were not assigned using grade points before 2002,
when Figlio & Rouse (2006) study the system. Nevertheless, their strategy is similar in spirit.

All of the papers find that the test scores of students improve following a school’s receipt
of an “F” grade. For example, as shown in table 4, Chiang (2008) and Rouse et al. (2007) report
one-year gains ranging from 0.12 to 0.21σ in math and from 0.11 to 0.14σ in reading. Further,
Chiang (2008) and Rouse et al. (2007) find evidence that the improvements persist for at least
three years, even once the students leave the voucher-threatened school.^15 As such, these studies
may provide some evidence that increased competitive pressure can generate improvement in
public schools.^16

^15 In addition, Rouse et al. (2007) report finding evidence that the F-graded schools
responded in educationally-meaningful ways. For example, following receipt of an F-grade,
schools were more likely to focus on low-performing students, lengthen the amount of time
devoted to instruction, and increase resources available to teachers.

^16 A statistical issue with which all of the authors wrestle is whether the disproportionate
gains by students in the F-graded schools were due to mean-reverting measurement error or
reflected actual changes in response to the A+ Plan. Mean-reverting measurement error occurs
when gains the year after a school scores unusually low - and is thereby labeled as “F” - reflect
the measurement error in test scores. That is, the test scores of students might have increased in
many of the “F” schools even in the absence of the A+ Plan simply because they were
transitorily low in the prior year. The reliance on a regression discontinuity design helps to
mitigate the presence of mean-reverting measurement error, although the authors employ
other strategies as well.

However, the F-graded schools in Florida were also stigmatized as “failing” (one of the
intents of the public announcements of the grades). As such, one cannot strictly distinguish a
“voucher effect” from a “stigma effect,” where under a stigma effect the school administrators
and teachers are motivated to improve not because of perceived increased competition, but
because the label “failing school” generates a significant loss of utility.^17 Figlio & Rouse (2006)
indirectly assess the impact of stigma by comparing student achievement following the
implementation of the A+ Plan (which involved both the threat of vouchers and stigma) with
student achievement following the placement of schools on a critically low performers list in
1996, 1997, and 1998 that involved public stigma, but no threat of vouchers. They estimate that
the student gains in reading were nearly identical under the two regimes and were actually larger
in math following placement on the critically low performers list, suggesting that the relative
improvements among the low-performing schools may have been due more to stigma than to the
threat of vouchers.

^17 Given that school principals and teachers have chosen their profession out of a desire
to teach children, such a loss of utility might stem from loss of “identity utility” (Akerlof &
Kranton 2005, 2007) or from fear of loss of standing in the wider community.

In sum, while the expansion of the Milwaukee Parental Choice Program and the threat of
vouchers created by the Florida A+ Plan provide some evidence that student achievement
improves in schools facing increased competition, the research strategies do not allow one to rule
out other explanations for the improvements. As such, we conclude there is no conclusive
support for the potential for vouchers to spur public schools to improve.

Do Students Benefit from Other Forms of School Choice? 

As noted previously, school vouchers are not the only mechanism for broadening the 
publicly-funded schooling choices available to families. School districts have operated magnet 
schools and implemented open enrollment plans for decades, and more recently, families have 
had the option of charter schools as well. As such, estimates of the effects of other forms of 
choice on student achievement may provide additional evidence on the potential gains from 
private school vouchers. While a full review of the evidence from these other forms of school 
choice is beyond the scope of this paper, we briefly review some of it. 

Charter schools are probably the closest analog within the public sector to private 
schools. While their administrative organization and regulation varies tremendously from state 
to state, they are publicly funded and typically have more autonomy than traditional public 
schools. Importantly, children are not assigned to attend charter schools - they only attend
through the active choice of their parents. As a way of increasing choice within the public
sector, they have become increasingly popular: while there were only two charter schools in 
operation in the U.S. in 1992 (Bettinger 2005), by 2007 approximately 1,200,000 students 
attended over 4,100 such schools (http://edreform.com). 

Not surprisingly, researchers have begun to find ways to evaluate whether charter schools 
generate better student outcomes than traditional public schools. Constrained by a dearth of data 
on individual students, early studies usually relied on test scores at the school level - often from 
Michigan, an early adopter of charter schools (see, e.g., Eberts & Hollenbeck 2002, Bettinger 
2005). These papers typically find that the achievement of students in charter schools is no 
greater than that in traditional public schools. Clearly a challenge with school-level data, 
however, is in accounting for the characteristics of the students taking the exam. As a result, 
more recent studies have been based on student-level data using two general approaches. 

The first approach has been to use state-wide student-level data (available from Florida, 
North Carolina, and Texas) and to control for time-invariant student characteristics using 
individual-level fixed effects (see Sass 2006, Bifulco & Ladd 2006, and Hanushek et al. 2007). 
These studies identify the effect of charter schools by comparing a student’s achievement in a 
charter school to his or her achievement in a traditional public school. If there are time-varying 
differences between students (which there could be since, for example, students might decide to 
change schools because they started to do poorly in their original school) then the estimates will 
be biased. That said, all three papers estimate slight negative impacts of charter schools on 
student achievement gains. There is some evidence, however, that the negative impacts decrease 
the longer the charter school has been in operation such that after 4-5 years students in charter
schools have similar achievement gains to those in traditional public schools.

A second approach takes advantage of the fact that most charter schools must admit all 
students who apply or hold a lottery if oversubscribed. These lotteries therefore mimic a random 
assignment design. Hoxby & Rockoff (2004) and Hoxby & Murarka (2007) implement this 
design using data from Chicago and New York City. The results are mixed; the evidence from 
Chicago suggests no overall gains for students attending charter schools while that from New 
York City suggests small yearly test score gains. Overall the weight of the evidence thus far 
does not suggest that charter schools are much more effective than traditional public schools; 
however, these schools are relatively new and their effectiveness may improve with age. 

Open enrollment or district-wide school choice - in which students are not assigned to 
their neighborhood school, but can choose a school within the district - provides another way to 
generate evidence on whether student achievement improves when students actively choose. In 
these systems, students typically rank a number of schools and then are matched to schools 
according to an algorithm. While certain preferences may be built into the selection process - 
such as for siblings and proximity - these systems often include a lottery. Several recent papers 
take advantage of the fact that some students are randomly allocated to their school of choice to 
estimate whether the achievement of lottery “winners” improves relative to the achievement of 
those who “lose” the lottery. For example, Cullen et al. (2006) and Cullen & Jacob (2007)
exploit randomized lotteries among high schools and elementary schools in the Chicago Public
School District and find no overall improvement in academic achievement among lottery
winners compared to lottery losers. Similarly, Hastings et al. (2006) study the introduction of
open enrollment in the Charlotte-Mecklenburg Public School District and also report no overall
gains among lottery winners.

^18 Magnet schools are another form of choice within public schools; however, their typical
administration does not lend itself to rigorous analysis. In many districts, magnet schools
specialize in a particular subject (e.g., music, science, computers, foreign language) and the
schools are not obligated to randomly select students if oversubscribed. As such, it is difficult to
obtain statistically unbiased estimates of their effectiveness.

Other sources of choice also provide some evidence on the potential for competition to 
improve public schools. As noted earlier, perhaps the largest potential source of competition 
between public schools arises because school quality already factors heavily into residential 
choices. While there is no direct evidence on whether public schools respond to such choice, 
Hoxby (2000, 2007) attempt to assess whether public school students in metropolitan areas 
where there are many school districts (and hence much residential choice) perform better than 
public school students in metropolitan areas where there are fewer school districts. Because the 
size of school districts may be endogenous, she employs an instrumental variables strategy using 
the number of rivers in the metropolitan area as an instrument for the concentration of districts in 
an area. While Hoxby (2000, 2007) concludes that competitive pressure, indeed, improves 
public school student achievement, Rothstein (2007) finds that her results are sensitive to the 
manner in which the instrumental variable is constructed. 
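
To make the instrumental variables logic concrete, the following minimal Python sketch simulates a setting in which the concentration of districts is correlated with an unobserved determinant of achievement. All variable names, coefficients, and the simulated data are illustrative assumptions; this is not the specification or the data used in the studies cited above. With a single instrument, the IV estimate is the covariance between the instrument and the outcome divided by the covariance between the instrument and the endogenous variable.

```python
import random


def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)


def ols_slope(outcome, treatment):
    """Slope from a simple regression of outcome on treatment."""
    return covariance(treatment, outcome) / covariance(treatment, treatment)


def iv_slope(outcome, treatment, instrument):
    """Single-instrument IV estimate: cov(z, y) / cov(z, x)."""
    return covariance(instrument, outcome) / covariance(instrument, treatment)


def simulate(n=20000, true_effect=0.5, seed=0):
    """Simulated metropolitan areas (hypothetical data, not from any cited study)."""
    rng = random.Random(seed)
    rivers, districts, achievement = [], [], []
    for _ in range(n):
        z = rng.gauss(0, 1)  # instrument: standardized number of rivers
        u = rng.gauss(0, 1)  # unobserved factor affecting both variables
        x = 0.8 * z + u + rng.gauss(0, 1)                # district concentration
        y = true_effect * x + 2.0 * u + rng.gauss(0, 1)  # student achievement
        rivers.append(z)
        districts.append(x)
        achievement.append(y)
    return achievement, districts, rivers


achievement, districts, rivers = simulate()
naive = ols_slope(achievement, districts)      # biased by the unobserved factor
iv = iv_slope(achievement, districts, rivers)  # consistent if the instrument is valid
print(f"OLS slope: {naive:.2f}, IV slope: {iv:.2f}, true effect: 0.50")
```

In this simulated setting the naive slope overstates the effect because of the unobserved factor, while the IV slope recovers something close to the assumed effect. Because the estimate is a ratio involving the instrument, it can move noticeably when the instrument is measured differently.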

The rapid growth of charter schools provides another means of studying the potential 
impact of competition on traditional public schools. Bettinger (2005), Bifulco & Ladd (2006), 
and Sass (2006) attempt to estimate whether an increase in the number of charter schools near 
traditional public schools improves the achievement of students in the traditional public schools. 
Bettinger (2005) and Bifulco & Ladd (2006) find no evidence that the achievement of students 
who remain in the nearby traditional public schools improves with the presence of charter 
schools, although Sass (2006) finds some evidence for improvement in math achievement. 

Using a structural model to identify parental preferences, Hastings et al. (2006) 
conclude that academically-oriented families benefit from school choice. Hastings and 
Weinstein (forthcoming) also reach this conclusion based on a randomized experiment in which 
they manipulated the presentation of information available to parents on school test scores. 

Overall, other forms of school choice do not provide strong evidence that students who 
exercise their choice experience achievement gains. Further, the weight of the evidence suggests 
that these other forms of school choice do not induce public schools to improve either. That 
said, the research on charter schools, in particular, is relatively new (as is the sector); as the 
schools mature and become more established within communities both their effectiveness and 
their “threat” to the local public schools as a viable alternative may increase. 



Bettinger (2005) attempts to account for the fact that the location of charter schools 
may not be exogenous by taking advantage of institutional details in the development of such 
schools in Michigan. Both Bifulco & Ladd (2006) and Sass (2006) attempt to do so by 
including fixed effects that reflect student enrollment spells in a particular school (such that the 
impact of the local competition is identified by comparing students in the same school as the 
level of competition changes). 


Might School Vouchers be a Cost-Neutral Way to Increase Social Welfare? 

While the literature on achievement gains does not find wholesale improvement from 
voucher programs, vouchers may nonetheless make sense from a cost-benefit perspective, 
particularly if one broadens the potential criteria on which to judge them. First, one might 
support vouchers as a way of promoting greater equity by providing poor families more 
opportunities for opting out of the public system - such as those currently enjoyed by wealthier 
families. Second, one consistent finding in the literature is that voucher parents report being 
more satisfied with their current schooling than non-voucher parents. For example, in the 
DCOSP parents of students offered a voucher gave their child’s school a significantly higher 
overall grade on a five-point scale (grades “A” through “F”) and were significantly more likely 
to give their child’s school a grade of “A” or “B.” Further, they reported significantly greater 
satisfaction with their child’s school on all dimensions asked, including location, class sizes, 
discipline, academic quality, and the racial mix of the students (Wolf et al. 2007). These results 
have also generally been reported for other voucher programs such as those in New York City 
(Mayer et al. 2002) and Milwaukee (Witte et al. 1995). 

At the same time, not all parents are satisfied with the voucher schools. Focus groups 
from DCOSP participants found that parents believed a few schools misrepresented aspects of 
their program and that there was a need for an evaluation of participating schools (Stewart et al. 
2007). Similarly, in the early years of the Milwaukee Parental Choice Program, 43% of the 
parents who took their children out of the voucher schools cited the quality of the voucher school 
as one of the primary reasons for withdrawal, including being unhappy with the staff, the 
education their child was receiving, a lack of programs for special needs, and that the teachers 
were too disciplinarian. Thirty percent cited the quality of the program, including hidden school 
fees, difficulties with transportation, and the limitation on religious instruction (Witte et al. 
1995). 

If one considers gains in equity and increased parental satisfaction, then introducing 
vouchers could increase social welfare if vouchers are no more expensive than our current 
system of public education. This potential net improvement in social welfare depends on both the 
general equilibrium effects of vouchers and the cost advantage over current public schools, two 
issues that are not well understood. While small-scale voucher programs indicate that parents 
offered a voucher are more satisfied with their child’s school than those not offered a voucher, a 
large scale voucher program might generate some parents who are more satisfied and some who 
are less satisfied. In order for social welfare to be increased with a cost-neutral voucher 
program, the benefits to the parents made better off must be large enough to outweigh the losses 
to parents made worse off. 

Additionally, it is not clear that a well-developed voucher program would be cost-neutral. 
On its face an education voucher system should be no more expensive than the current system as 
the state (or other public entity) would simply send a voucher check to schools for each 
participating child rather than to the local public school or district. However, if implemented on 
a large scale, there may be other, less appreciated costs that would depend critically on the 
design of the program. Levin & Driver (1997) caution that depending on how a program deals 
with students currently attending private schools, the transportation of children to and from 
school, record keeping and monitoring of student enrollment, and the process of adjudicating 
disputes (particularly if there are differing voucher amounts), the cost of a voucher system could 
actually exceed that of the current geographically-based system. While their estimates are 
rough - based on hypothetical voucher programs and crudely estimated costs - their analysis 
suggests, at a minimum, that we should not assume a voucher program would be cost-neutral. 
Further, there may be large costs associated with the transition to a voucher system that should 
be considered. 

Why has it been so difficult to observe large improvements in student achievement? 

Why might vouchers (or competition in general) not generate large improvements in 
student achievement? One explanation may be that the public sector is not as inefficient as many 
perceive because schools already compete for students through residential choice (see, e.g., 
Barrow & Rouse 2004). Another explanation may be that the education sector does not meet the 
conditions for perfect competition to result in an efficient outcome (Garner & Hannaway 1982). 
For one, information on school quality may be costly and difficult for parents to obtain. 
Obviously, any potential academic gains from additional choice cannot be realized if consumers 
do not have the information on which to make informed decisions. Further, education is not a 
homogeneous good. While competition for students may make schools more responsive to 
parents, this may be achieved through changes in other dimensions, such as religious education 
or nicer gymnasiums, rather than academic achievement. A growing literature is attempting to 
understand what kind of information is available to parents or conversely, whether one can 
improve it or make it more transparent (see, e.g., Hastings & Weinstein, forthcoming). 
Similarly, several recent studies have attempted to better understand the extent to which parents 
- particularly low-income parents who would most likely be offered school vouchers - factor a 
school’s academic quality into their decision-making process (Hanushek et al. 2007; Hastings et 
al. 2005, 2006; Hastings & Weinstein forthcoming; and Jacob & Lefgren 2007). Unfortunately, 
the findings are mixed. 

In addition, the studies to date necessarily focus on the short-run effects of vouchers 
when, in fact, there may be longer-run impacts on high school graduation, college enrollment, or 
even future earnings. For example, Altonji et al. (2005b) study the effect of Catholic education 
on a variety of outcomes and find little evidence that Catholic schools raise student test scores. 
At the same time, their results suggest that Catholic schools increase the probability of 
graduating from high school and potentially the probability of enrolling in college. These longer- 
run effects have yet to be credibly examined in studies of school vouchers. 

CONCLUSION 

Milton Friedman’s dream of a publicly-funded - but not necessarily publicly-provided - 
school system where parents have a choice of many different schools for their children has never 
been tested in the U.S. And yet, its theoretical appeal has led to several, mostly small-scale, 
attempts to determine whether students might benefit from such a reform. Unfortunately, results 
from these small programs cannot test Friedman’s hypotheses. The most credible evidence 
comes from studies focused on the short-run academic gains for students who use vouchers. As 
a result, many questions remain about the potential long-run impacts on academic outcomes and 
about both the public and private sector responses to a large, permanent, and well-funded 
voucher program. 

Keeping these limitations in mind, the best research to date finds relatively small 
achievement gains for students offered education vouchers, most of which are not statistically 
different from zero. Further, what little evidence exists about the likely impact of a large-scale 
voucher program on the students who remain in the public schools is at best mixed, and the 
research designs of these studies do not necessarily allow the researchers to attribute any 
observed positive gains solely to school vouchers and competitive forces. The evidence to date 
from other forms of school choice is not much more promising. As such, while there may be 
other reasons to implement school voucher programs, one should not anticipate large academic 
gains from this seemingly inexpensive reform. 

Table 1: Estimated test score impacts of publicly-financed voucher programs 

                                                                            Math                       Reading
Voucher Program                      Notes                                  ITT              TT        ITT             TT
---------------------------------------------------------------------------------------------------------------------------
Milwaukee Parental Choice Program    Annual test score growth of students   0.06* to 0.11**  0.14**    -0.03 to 0.03   0.01
from Rouse (1998)                    in grades K-8 selected to receive a
                                     voucher relative to unsuccessful
                                     applicants and low-income students
                                     in Milwaukee Public Schools; with
                                     and without student fixed effects.

Cleveland Scholarship and Tutoring   Grade 2 test score gains of students   -0.11**          -0.11**   -0.13**         -0.13**
Program from Belfield (2007)         selected to receive a voucher
                                     relative to unsuccessful applicants.

Cleveland Scholarship and Tutoring   Grade 4 test score gains of students   -0.02            -0.08     0.04            0.07
Program from Belfield (2007)         selected to receive a voucher
                                     relative to unsuccessful applicants.

D.C. Opportunity Scholarship         Randomized experiment comparing        -0.01 to 0.07*             -0.01 to 0.03
Program from Wolf et al. (2007)      lottery winners in grades K-12 to
                                     non-recipients in the first year.

D.C. Opportunity Scholarship         Randomized experiment comparing        -0.02 to 0.01              0.05 to 0.08*
Program from Wolf et al. (2008)      lottery winners in grades K-12 to
                                     non-recipients in the second year.
---------------------------------------------------------------------------------------------------------------------------

Notes: Reported estimates have been converted to effect sizes in national standard deviation units. Estimates from Rouse 
(1998) are from Table VI col. (1) and Table Va col. (2) for math and from Table Vb cols. (7)-(8) for reading. Clive Belfield 
generously provided the intent-to-treat estimates for Cleveland reported above as well as within sample standard deviation 
information used to convert effect sizes to those based on national standard deviation units from CTB/McGraw-Hill, 2001. 
Estimates of the effect of treatment-on-the-treated come from Belfield (2007) Table 3, panel C, col. (1) and (2) and Table 6, 
panel C, col. (1) and (2). Estimates from Wolf et al. (2007) are from Tables H-1 and H-2, Full sample; those from Wolf et 
al. (2008) are from Tables D-1 and D-2, Full sample. For D.C. we use the average national standard deviation over grades 
K through 12 reported in Stanford Achievement Test Series (1996) along with those reported in Wolf, et al. (2007) to 
convert effect sizes. ITT is "Intent-to-Treat" and TT is "Treatment-on-the-Treated." Statistical significance levels are 
reported as: *** = 1 percent; ** = 5 percent; * = 10 percent. 
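
The relationship between the ITT and TT columns can be illustrated with a short Python sketch. It applies the standard adjustment that divides the intent-to-treat effect by the difference in private school attendance rates between students offered and not offered a voucher. The function name and the example numbers are hypothetical, and the studies summarized in the table may rely on different estimators (for example, instrumental variables with covariates), so this is only a sketch of the idea.

```python
def treatment_on_treated(itt_effect, attend_rate_offered, attend_rate_not_offered=0.0):
    """Scale an ITT effect by the difference in take-up between the two groups.

    Assumes the voucher offer affects test scores only through private
    school attendance (an assumption made for this illustration).
    """
    take_up_gap = attend_rate_offered - attend_rate_not_offered
    if take_up_gap <= 0:
        raise ValueError("the offer must raise attendance for the ratio to be defined")
    return itt_effect / take_up_gap


# Hypothetical numbers: an ITT effect of 0.06 standard deviations, with 75% of
# students offered a voucher and 5% of those not offered one attending private school.
tt = treatment_on_treated(0.06, 0.75, 0.05)
print(f"Implied TT effect: {tt:.3f} standard deviations")
```

Because the take-up gap is at most one, the implied TT effect is never smaller in magnitude than the ITT effect under these assumptions.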

Table 2: Estimated test score impacts for privately-financed voucher programs after three years 

                                                                             All Students            African American Students
Study                      Voucher Program    Notes                          ITT             TT      ITT       TT
------------------------------------------------------------------------------------------------------------------------
Mayer, et al. (2002)       New York City      Randomized experiment          0.03            0.05    0.19***   0.26***
                                              comparing lottery winners
                                              and voucher users to
                                              non-recipients. TT estimates
                                              reflect the gains to
                                              attending private school for
                                              at least one year.

Krueger & Zhu (2004)       New York City      Randomized experiment          -0.01 to 0.01   0.00    0.05      0.03
                                              comparing lottery winners
                                              and users to non-recipients.
                                              TT estimates reflect the
                                              gains to an additional year
                                              in private school.

Howell & Peterson (2002)   Two-city average   Randomized experiment                          0.02              0.23***
                                              comparing voucher users to
                                              non-recipients. Estimates
                                              reflect the gains to
                                              attending private school for
                                              at least one year.
------------------------------------------------------------------------------------------------------------------------

Notes: The two-city average is for New York City and Washington, D.C. National percentile rank impacts were converted to 
effect sizes in national standard deviation units using a standard deviation of 28.5. Estimates for Mayer, et al. (2002) come from 
Table 20, col. (3) and (6). ITT estimates from Krueger & Zhu (2004) are from Tables 4 and 5, Third follow-up test (using the 
broadest definition of African American); TT estimates are from Table 6, Third follow-up test (using the broadest definition of 
African American). Howell & Peterson (2002) estimates are from Table 6-1, Year III. ITT is "Intent-to-Treat" and TT is 
"Treatment-on-the-Treated." 

Statistical significance levels are reported as: *** = 1 percent; ** = 5 percent; * = 10 percent. 
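
As a worked illustration of the conversion described in these notes, the sketch below divides an impact measured in national percentile rank points by the standard deviation of 28.5. The function name and the example impact are hypothetical and do not correspond to any estimate in the table.

```python
NATIONAL_PERCENTILE_RANK_SD = 28.5  # standard deviation stated in the table notes


def to_effect_size(impact, scale_sd=NATIONAL_PERCENTILE_RANK_SD):
    """Express an impact in standard deviation units of the test scale."""
    if scale_sd <= 0:
        raise ValueError("standard deviation must be positive")
    return impact / scale_sd


# Hypothetical impact of 5.7 national percentile rank points.
effect = to_effect_size(5.7)
print(f"Effect size: {effect:.2f} standard deviations")
```

The same function applies to other test scales by passing the relevant standard deviation as `scale_sd`.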

Table 3: Estimated test score impacts from expansion of the Milwaukee voucher program 

     Study                 Notes                                          Time period             Math      Language
----------------------------------------------------------------------------------------------------------------------
(1)  Hoxby (2003)          Annual test score growth of students in        1996-97 to 1999-2000    0.12**    0.11**
                           most-treated schools relative to students in
                           untreated comparison schools (outside
                           Milwaukee).

(2)  Carnoy et al. (2007)  Average program impact on test score gain of   1998-99 to 2001-02      0.22**    0.16**
                           students in lowest income schools relative to
                           students in comparison schools outside
                           Milwaukee.

(3)  Carnoy et al. (2007)  Average program impact on test score gain of   1998-99 to 2004-05      0.22**    0.16**
                           students in lowest income schools relative to
                           students in comparison schools outside
                           Milwaukee.

(4)  Carnoy et al. (2007)  Test score gains per voucher place within 1    2001-02                 -0.04     -0.01
                           mile/enrollment in 2001-02.

(5)  Carnoy et al. (2007)  Test score gains per average voucher           2001-02                 -0.12     -1.54
                           application/enrollment 1998-2001.

(6)  Chakrabarti (2008)    Program impact on test scores of more          2001-02                 0.15**    0.24***
                           treated schools relative to comparison
                           schools outside Milwaukee.
----------------------------------------------------------------------------------------------------------------------

Notes: All estimates apply to test scores for students in the 4th grade. Estimates from Hoxby (2003) are derived from Table 
8.8 and are converted to standard deviation units using the national percentile rank standard deviation of 28.5. Estimates from 
Carnoy et al. (2007) come from Tables 3 and 9. Table 3 estimates are converted to standard deviation units using the national 
Terra Nova standard deviations for 4th grade of 39.32 in math and 36.27 in language. Table 9 estimates are converted to 
standard deviation units using the normal curve equivalent standard deviation of 21.05. Estimates from Chakrabarti (2008) 
come from Table 12, Panel C, and are converted to national standard deviation units using within sample standard deviations 
reported and the national standard deviation of 21.06 for normal curve equivalent scores. 

Statistical significance levels are reported as: *** = 1 percent; ** = 5 percent. 

Table 4: Estimated test score impacts of receipt of "F" grade from Florida's A+ Plan for Education 

                                                              Math                    Reading
Study                 Notes                                   Year 1      Year 3      Year 1      Year 3
------------------------------------------------------------------------------------------------------------
Rouse et al. (2007)   Regression discontinuity estimates      0.212***    0.118***    0.140***    0.088***
                      reflecting the impact of receiving an
                      "F" grade (controls for school fixed
                      effects).

Chiang (2008)         Regression discontinuity estimates      0.118**     0.084*      0.112**     0.030
                      reflecting the impact of receiving an
                      "F" grade (Year 3 estimates control
                      for observable school
                      characteristics).
------------------------------------------------------------------------------------------------------------

Notes: All estimates based on the FCAT Reading and Math ("high-stakes") tests. Estimates from Rouse et al. 
(2007) come from Table 4 rows labeled "2002-03 cohort compared with 2001-02 cohort" in 1st and 2nd panels. 
Chiang (2008) estimates come from Tables 6 and 7 rows labeled "All accountable students" including middle school 
controls (for the Year 3 estimates). All coefficients have been normalized by the standard deviation of test scores of 
students in Florida by grade. 

Statistical significance levels are reported as: *** = 1 percent; ** = 5 percent; * = 10 percent. 